Co-STAR: A Co-training Style Algorithm for Hyponymy Relation Acquisition from Structured and Unstructured Text
Authors
Abstract
This paper proposes a co-training style algorithm called Co-STAR that acquires hyponymy relations simultaneously from structured and unstructured text. In Co-STAR, two independent processes for hyponymy relation acquisition – one handling structured text and the other handling unstructured text – collaborate by repeatedly exchanging the knowledge they have acquired about hyponymy relations. Unlike conventional co-training, the two processes in Co-STAR are applied to different source texts and training data. We show the effectiveness of this algorithm through experiments on large-scale hyponymy-relation acquisition from Japanese Wikipedia and Web texts. We also show that Co-STAR is robust against noisy training data.
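The exchange loop described in the abstract can be sketched roughly as follows. This is only an illustrative toy, not the authors' implementation: the abstract does not specify the learners, the features, or how knowledge is exchanged, so the one-dimensional `ThresholdClassifier`, the `min_margin` confidence cutoff, the rule of sharing only pairs that both processes have as candidates, and every name below are assumptions made for the example.

```python
from dataclasses import dataclass

Pair = tuple[str, str]  # (hypernym, hyponym) candidate; hypothetical representation


@dataclass
class ThresholdClassifier:
    """Toy 1-D learner standing in for whatever classifier each process uses."""
    threshold: float = 0.0

    def fit(self, labeled: dict[Pair, bool], features: dict[Pair, float]) -> None:
        # Place the threshold midway between the positive and negative means.
        pos = [features[p] for p, y in labeled.items() if y]
        neg = [features[p] for p, y in labeled.items() if not y]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def margin(self, x: float) -> float:
        # Signed distance from the threshold; the sign is the predicted label.
        return x - self.threshold


def co_star_sketch(feat_s, feat_u, train_s, train_u, rounds=3, min_margin=0.25):
    """Two processes (structured / unstructured) with separate features and
    separate training data exchange confidently labeled candidates.

    Assumption: a label is passed on only for pairs that are candidates in
    both sources and that the receiving process has not yet labeled.
    """
    train_s, train_u = dict(train_s), dict(train_u)  # do not mutate inputs
    clf_s, clf_u = ThresholdClassifier(), ThresholdClassifier()
    for _ in range(rounds):
        clf_s.fit(train_s, feat_s)
        clf_u.fit(train_u, feat_u)
        for src_clf, src_feat, dst_train, dst_feat in (
            (clf_s, feat_s, train_u, feat_u),  # structured -> unstructured
            (clf_u, feat_u, train_s, feat_s),  # unstructured -> structured
        ):
            for pair, x in src_feat.items():
                m = src_clf.margin(x)
                if pair in dst_feat and pair not in dst_train and abs(m) >= min_margin:
                    dst_train[pair] = m > 0
    # Refit once more so the returned classifiers reflect the final training sets.
    clf_s.fit(train_s, feat_s)
    clf_u.fit(train_u, feat_u)
    return clf_s, clf_u, train_s, train_u
```

In this toy setup each process starts from its own labeled pairs; a process that is confident about a shared candidate hands its label to the other, which is the sense in which the two training sets differ yet still inform each other.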
Similar resources
Bilingual Co-Training for Monolingual Hyponymy-Relation Acquisition
This paper proposes a novel framework called bilingual co-training for a large-scale, accurate acquisition method for monolingual semantic knowledge. In this framework, we combine the independent processes of monolingual semantic-knowledge acquisition for two languages using bilingual resources to boost performance. We apply this framework to large-scale hyponymy-relation acquisition from Wikipedi...
Extracting Hyponymic Relations from Chinese Free Corpus
Research on hyponymy acquisition is a basic and crucial problem in knowledge acquisition from text. In this paper we present a method of hyponymic relation acquisition and verification based on Chinese lexico-syntactic patterns. Firstly, we make use of removable lexicons and sentence patterns that have been semi-automatically obtained to analyze Chinese-isa patterns. Then we use an algorithm th...
Discovering Multi Terms and Co-hyponymy from XHTML Documents with XTREEM
The Semantic Web needs ontologies as an integral component. Current methods for learning and enhancing ontologies need to be further improved to overcome the knowledge acquisition bottleneck. The identification of concepts and relations with only minimal user interaction is still a challenging objective. Current approaches to extracting semantics often use association rules or clusterin...
A hybrid neural–genetic algorithm for predicting pure and impure CO2 minimum miscibility pressure
Accurate prediction of the minimum miscibility pressure (MMP) in a gas injection process is crucial to optimizing the management of gas injection projects. Because the ...
Acquiring Hyponymy Relations from Web Documents
This paper describes an automatic method for acquiring hyponymy relations from HTML documents on the WWW. Hyponymy relations can play a crucial role in various natural language processing systems. Most existing acquisition methods for hyponymy relations rely on particular linguistic patterns, such as “NP such as NP”. Our method, however, does not use such linguistic patterns, and we expect that...